crypto: skcipher - per-tfm multi-data-unit batching#892
blktests-ci[bot] wants to merge 4 commits into
Add a per-tfm data_unit_size and an algorithm capability flag that
together allow a caller to submit several data units in a single
skcipher request. The IV passed in the request applies to the first
data unit; the algorithm advances the tweak between data units
according to the mode specification (e.g., LE128 multiply for XTS per
IEEE 1619).

This mirrors the data_unit_size concept already exposed by
struct blk_crypto_config for inline encryption hardware, but at the
software skcipher layer. The first user is dm-crypt, which today issues
one request per sector and so pays a per-sector cost in request
allocation, IV generation, callback dispatch, and completion handling.
Allowing the cipher to consume a whole bio per request removes that
overhead for drivers that can chain across data units internally.

The data_unit_size lives on struct crypto_skcipher rather than on
struct skcipher_request because it does not change between requests
for any plausible consumer: dm-crypt picks one sector size per mapped
target at table load time; fscrypt would pick one per master key.
Anchoring it to the tfm also lets the driver validate it once at
setkey() time and avoids per-request initialisation hazards on
mempool-recycled requests.

Capability is advertised with CRYPTO_ALG_SKCIPHER_MULTI_DATA_UNIT in
cra_flags (type-specific high-byte range, mirroring the
CRYPTO_AHASH_ALG_* convention). This makes the capability visible in
/proc/crypto and lets templates OR it into their derived algorithms.

crypto_skcipher_set_data_unit_size() returns -EOPNOTSUPP if the
algorithm does not advertise the flag, and accepts 0 (the default)
unconditionally so callers can re-disable batching cheaply.

crypto_skcipher_encrypt()/decrypt() reject requests whose cryptlen is
not a multiple of the configured data_unit_size with -EINVAL. The check
is gated on data_unit_size != 0 so it costs nothing for the common
single-data-unit case.

No in-tree algorithm advertises the flag yet; subsequent patches add
the generic xts() template, arm64, and x86 producers as well as the
dm-crypt consumer.

Signed-off-by: Leonid Ravich <lravich@amazon.com>
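The setter and length-check rules described above can be sketched as a small userspace model. This is only an illustration under stated assumptions: struct model_tfm and the model_* functions are hypothetical stand-ins, not the kernel structures or the code in this patch; only the return codes and the conditions come from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the per-tfm state (not the kernel struct). */
struct model_tfm {
	bool multi_du_capable;       /* models CRYPTO_ALG_SKCIPHER_MULTI_DATA_UNIT */
	unsigned int data_unit_size; /* 0 (the default) means batching is disabled */
};

/* Models the behaviour described for crypto_skcipher_set_data_unit_size(). */
static int model_set_data_unit_size(struct model_tfm *tfm, unsigned int size)
{
	assert(tfm != NULL);

	/* Zero is accepted unconditionally so callers can re-disable batching. */
	if (size == 0) {
		tfm->data_unit_size = 0;
		return 0;
	}

	/* A non-zero size requires the algorithm to advertise the capability. */
	if (!tfm->multi_du_capable)
		return -EOPNOTSUPP;

	tfm->data_unit_size = size;
	return 0;
}

/* Models the cryptlen check described for the encrypt()/decrypt() entry points. */
static int model_check_cryptlen(const struct model_tfm *tfm, size_t cryptlen)
{
	assert(tfm != NULL);

	/* The check only runs when batching is configured. */
	if (tfm->data_unit_size != 0 && cryptlen % tfm->data_unit_size != 0)
		return -EINVAL;

	return 0;
}
```

In this model a 4096-byte request passes with a 512-byte data unit, a 4000-byte request is rejected with -EINVAL, and any length passes once the size is reset to 0.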
Teach the generic xts() template to consume cryptlen larger than one
data unit when the caller has configured a non-zero data_unit_size on
the tfm. Each data unit is processed with its own IV, derived from the
caller-supplied IV by treating it as a 128-bit little-endian counter
and adding the data-unit index. This matches the sector-indexed XTS
used by dm-crypt's plain64 IV mode and by typical inline-encryption
hardware.

The single-data-unit body is unchanged and is now reached via a thin
xts_crypt_multi() dispatcher that skips straight to the body when
data_unit_size is zero (the legacy default), so existing users see no
extra cost.

Advertise CRYPTO_ALG_SKCIPHER_MULTI_DATA_UNIT in cra_flags only when
the inner cipher is synchronous. An async inner cipher would require a
per-DU completion chain which is out of scope for the slow software
template; consumers that need multi-DU on async hardware will use one
of the arch-specific drivers added later in this series.

Signed-off-by: Leonid Ravich <lravich@amazon.com>
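The per-data-unit IV derivation described above (caller IV treated as a 128-bit little-endian counter, plus the data-unit index) can be sketched in userspace C as follows. The function name iv_for_data_unit() and the XTS_IV_SIZE macro are hypothetical and chosen for illustration; this is not the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define XTS_IV_SIZE 16 /* illustrative name for the 16-byte XTS IV length */

/*
 * out = base + index, where base and out are 128-bit little-endian
 * integers stored as byte arrays (byte 0 is least significant).
 */
static void iv_for_data_unit(uint8_t out[XTS_IV_SIZE],
			     const uint8_t base[XTS_IV_SIZE],
			     uint64_t index)
{
	unsigned int carry = 0;

	assert(out != NULL && base != NULL);

	for (int i = 0; i < XTS_IV_SIZE; i++) {
		/* Only the low 8 bytes receive bits from the 64-bit index. */
		unsigned int add = (i < 8) ? (unsigned int)((index >> (8 * i)) & 0xff) : 0;
		unsigned int sum = base[i] + add + carry;

		out[i] = (uint8_t)sum; /* keep the low byte */
		carry = sum >> 8;      /* propagate overflow to the next byte */
	}
}
```

The carry has to propagate past byte 7: if the low 64 bits of the base IV are all ones, adding index 1 clears them and sets byte 8 to 1. A plain 64-bit add on the low half would diverge from the 128-bit counter convention in that case.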
Add a self-comparison test that runs whenever an skcipher algorithm
advertises CRYPTO_ALG_SKCIPHER_MULTI_DATA_UNIT in cra_flags. The test
encrypts the same random plaintext two ways:
 1. as one batched request with data_unit_size set, and
 2. as N back-to-back single-data-unit requests with IVs derived from
    the original IV by adding the data-unit index (treated as a
    128-bit little-endian counter, matching the convention documented
    in crypto_skcipher_set_data_unit_size()).
Both encryptions must produce byte-identical ciphertext; otherwise the
algorithm's multi-DU implementation is inconsistent with its single-DU
behaviour. The test iterates over a fixed set of typical data unit
sizes (512, 1024, 2048, 4096), which covers the dm-crypt sector-size
range.
The test is gated on ivsize == 16 (XTS, the only multi-DU consumer in
the kernel today) and on the algorithm advertising the capability,
so it costs nothing for the existing fleet of skcipher drivers.
Signed-off-by: Leonid Ravich <lravich@amazon.com>
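The shape of the self-comparison can be sketched in userspace C. Everything here is an assumption made for illustration: toy_encrypt_unit() is a toy XOR transform and NOT a real cipher, toy_encrypt_batched() stands in for a driver's multi-DU path, and none of the names come from the patch. Because the toy batched path is built on the same per-unit routine, the comparison passes trivially here; in the kernel test the batched path is the driver's own implementation, which is what makes the check meaningful.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define IV_SIZE 16

/* out = iv + index, with iv treated as a 128-bit little-endian counter. */
static void iv_add(uint8_t out[IV_SIZE], const uint8_t iv[IV_SIZE], uint64_t index)
{
	unsigned int carry = 0;

	for (int i = 0; i < IV_SIZE; i++) {
		unsigned int add = (i < 8) ? (unsigned int)((index >> (8 * i)) & 0xff) : 0;
		unsigned int sum = iv[i] + add + carry;

		out[i] = (uint8_t)sum;
		carry = sum >> 8;
	}
}

/* Toy stand-in for one single-data-unit encryption. NOT a real cipher. */
static void toy_encrypt_unit(uint8_t *dst, const uint8_t *src, size_t len,
			     const uint8_t iv[IV_SIZE])
{
	for (size_t i = 0; i < len; i++)
		dst[i] = src[i] ^ iv[i % IV_SIZE] ^ (uint8_t)i;
}

/* Toy stand-in for a batched request that walks the data units internally. */
static void toy_encrypt_batched(uint8_t *dst, const uint8_t *src, size_t len,
				size_t du_size, const uint8_t iv[IV_SIZE])
{
	uint8_t du_iv[IV_SIZE];

	assert(du_size != 0 && len % du_size == 0);

	for (size_t off = 0, idx = 0; off < len; off += du_size, idx++) {
		iv_add(du_iv, iv, idx);
		toy_encrypt_unit(dst + off, src + off, du_size, du_iv);
	}
}

/* Returns true when batched and unit-by-unit outputs are byte-identical. */
static bool multi_du_self_compare(size_t du_size, size_t n_units, unsigned int seed)
{
	size_t len = du_size * n_units;
	uint8_t *pt = malloc(len), *batched = malloc(len), *single = malloc(len);
	uint8_t iv[IV_SIZE], du_iv[IV_SIZE];
	bool same = false;

	if (!pt || !batched || !single)
		goto out;

	/* Random plaintext and IV, reproducible via the seed. */
	srand(seed);
	for (size_t i = 0; i < len; i++)
		pt[i] = (uint8_t)rand();
	for (int i = 0; i < IV_SIZE; i++)
		iv[i] = (uint8_t)rand();

	/* Way 1: one batched request covering all data units. */
	toy_encrypt_batched(batched, pt, len, du_size, iv);

	/* Way 2: N single-unit requests, each with IV + data-unit index. */
	for (size_t idx = 0; idx < n_units; idx++) {
		iv_add(du_iv, iv, idx);
		toy_encrypt_unit(single + idx * du_size, pt + idx * du_size,
				 du_size, du_iv);
	}

	same = memcmp(batched, single, len) == 0;
out:
	free(pt);
	free(batched);
	free(single);
	return same;
}
```

A caller would invoke multi_du_self_compare() once per data unit size (512, 1024, 2048, 4096) and treat any false return as a mismatch between the batched and single-unit paths.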
When the underlying skcipher driver advertises support for multiple
data units in a single request (CRYPTO_ALG_SKCIPHER_MULTI_DATA_UNIT),
configure the cipher with cc->sector_size as data_unit_size and submit
one request per bio instead of one request per sector. This removes
per-sector overhead in the crypto API hot path: request allocation,
callback dispatch, completion handling, and SG setup.

The optimisation is enabled automatically at table load when all of
the following hold:

 - the cipher is non-aead (i.e. skcipher);
 - tfms_count is 1 (interleaved per-sector keys would break batching);
 - the IV mode is plain or plain64 (the only modes whose generator
   produces a sequential 64-bit little-endian counter that the cipher
   can extend by adding the data-unit index, matching the convention
   documented in crypto_skcipher_set_data_unit_size());
 - the iv_gen_ops->post() hook is unset (lmk and tcw use it; both are
   already excluded by the IV-mode test, but the explicit check makes
   the assumption durable against future IV modes);
 - dm-integrity is not stacked (no integrity tag or integrity IV);
 - the cipher driver advertises multi-data-unit support.

A new CRYPT_MULTI_DATA_UNIT cipher_flag, set once at construction
time, gates the multi-data-unit path. The existing per-sector path in
crypt_convert_block_skcipher() is unchanged; the new
crypt_convert_block_skcipher_multi() is reached from a small dispatch
in crypt_convert() and shares the same backlog/-EBUSY/-EINPROGRESS
flow control with the per-sector path.

Heap-allocated scatterlists are stashed in dm_crypt_request and freed
in crypt_free_req_skcipher() to avoid races between the
synchronous-success free path and async-completion reuse from the
request pool.

On -ENOMEM during scatterlist allocation, the bio is requeued via
BLK_STS_DEV_RESOURCE rather than failed, matching the behaviour of the
existing -ENOMEM path for crypto request allocation.

Verified end-to-end with a byte-equivalence test: encrypted output of
plain64 dm-crypt with the multi-data-unit path matches output of the
single-data-unit path bit-for-bit over a 256 MB device.

Signed-off-by: Leonid Ravich <lravich@amazon.com>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
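The table-load eligibility conditions listed in this commit message can be read as a single predicate. The sketch below is a userspace model only: struct model_crypt_config, enum model_iv_mode, and model_can_use_multi_du() are hypothetical names, and the fields are simplified booleans rather than the real dm-crypt crypt_config members.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of the IV modes relevant to the check. */
enum model_iv_mode { IV_PLAIN, IV_PLAIN64, IV_ESSIV, IV_LMK, IV_TCW, IV_OTHER };

/* Hypothetical stand-in for the dm-crypt per-target configuration. */
struct model_crypt_config {
	bool is_aead;               /* AEAD cipher rather than skcipher */
	unsigned int tfms_count;    /* number of interleaved tfms/keys */
	enum model_iv_mode iv_mode; /* configured IV generator */
	bool has_iv_post_hook;      /* models iv_gen_ops->post being set */
	bool integrity_stacked;     /* dm-integrity tag or integrity IV in use */
	bool driver_multi_du;       /* driver advertises the capability flag */
};

/* True only when every condition from the commit message holds. */
static bool model_can_use_multi_du(const struct model_crypt_config *cc)
{
	assert(cc != NULL);

	if (cc->is_aead)
		return false;
	if (cc->tfms_count != 1)
		return false;
	if (cc->iv_mode != IV_PLAIN && cc->iv_mode != IV_PLAIN64)
		return false;
	/* Redundant with the IV-mode test today, kept as an explicit guard. */
	if (cc->has_iv_post_hook)
		return false;
	if (cc->integrity_stacked)
		return false;
	return cc->driver_multi_du;
}
```

Evaluating the predicate once at construction time and caching the result corresponds to the CRYPT_MULTI_DATA_UNIT flag described above, so the per-bio dispatch only needs to test a single bit.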
Upstream branch: bade58e
Pull request for series with
subject: crypto: skcipher - per-tfm multi-data-unit batching
version: 2
url: https://patchwork.kernel.org/project/linux-block/list/?series=1101425